
ELSEVIER Virus Research 60 (1999) 181-189 




Phylogenetic analysis of a highly conserved region of the 
polymerase gene from 11 coronaviruses and development of a 
consensus polymerase chain reaction assay 

Charles B. Stephensen a,*,1, Donald B. Casebolt b,2, Nupur N. Gangopadhyay a,3

a Department of International Health, School of Public Health, University of Alabama at Birmingham, Birmingham, AL 35294, USA

b Department of Comparative Medicine, University of Alabama at Birmingham, Birmingham, AL 95616, USA 


Received 16 November 1998; received in revised form 9 February 1999; accepted 12 February 1999 


Abstract 

Viruses in the genus Coronavirus are currently placed in three groups based on antigenic cross-reactivity and 
sequence analysis of structural protein genes. Consensus polymerase chain reaction (PCR) primers were used to
obtain cDNA, and a highly conserved 922 nucleotide region in open reading frame (ORF) 1b of the polymerase (pol)
gene was then cloned and sequenced from eight coronaviruses. These sequences were compared with published sequences for
three additional coronaviruses. In this comparison, it was found that nucleotide substitution frequencies (per 100 
nucleotides) varied from 46.40 to 50.13 when viruses were compared among the traditional coronavirus groups and, 
with one exception (the human coronavirus (HCV) 229E), varied from 2.54 to 15.89 when compared within these 
groups. (The substitution frequency for 229E, as compared to other members of the same group, varied from 35.37 
to 35.72.) Phylogenetic analysis of these pol gene sequences resulted in groupings which correspond closely with the 
previously described groupings, including recent data which places the two avian coronaviruses—infectious bronchitis 
virus (IBV) of chickens and turkey coronavirus (TCV)—in the same group [Guy, J.S., Barnes, H.J., Smith L.G., 
Breslin, J., 1997. Avian Dis. 41:583-590]. A single pair of degenerate primers was identified that amplifies a 251 bp
region from coronaviruses of all three groups using the same reaction conditions. This consensus PCR assay for the 
genus Coronavirus may be useful in identifying as yet unknown coronaviruses. © 1999 Elsevier Science B.V. All rights 
reserved.

The genus Coronavirus is in the family Coronaviridae in the order Nidovirales (Cavanagh, 1997). Viruses of
this order have linear, non-segmented, positive-sense, single-stranded RNA genomes with similar genomic
organization and a nested set of subgenomic mRNAs. Members of the genus Coronavirus infect birds and mammals,
causing respiratory, enteric, cardiovascular and neurological disease (Holmes and Lai, 1996). The
coronaviruses were originally divided into three groups based on antigenic relatedness of the structural
proteins (Sturman and Holmes, 1983; Siddell, 1995), which include the haemagglutinin-esterase (HE), spike (S),
integral membrane (M) and nucleocapsid (N) proteins. Genes encoding these proteins are clustered at the
3' end of the 27-31 kb coronavirus genome. However, the most highly conserved genomic sequences are found in
the 20 kb polymerase (pol) gene, which covers the 5' two-thirds of the coronavirus genome (Snijder and Spaan,
1995). The pol gene contains two large open reading frames (ORFs), ORF 1a and ORF 1b. Within ORF 1b,
there are very highly conserved regions encoding conserved functions (e.g. polymerase and helicase
activity) which, combined with similarities in replication and expression strategies, demonstrate
an evolutionary link among coronaviruses, arteriviruses, and toroviruses. These similarities
form the rationale for placing these viruses in the order Nidovirales (Snijder et al., 1990, 1991; den
Boon et al., 1991; Godeny et al., 1993; Snijder and Spaan, 1995; Cavanagh, 1997). Therefore, the
highly conserved structure and function of viral polymerases make the pol gene a logical region for
making phylogenetic comparisons, as well as for developing a consensus polymerase chain reaction
(PCR) assay which could be used for the identification of novel coronaviruses. This strategy has
been used with other viruses, particularly papillomaviruses (Bernard et al., 1994; Astori et al.,
1997). Such an assay would be useful because possible novel coronaviruses have been tentatively
identified (e.g. using electron microscopy) in asso- 


Table 1

Amino acid (italics; on top, right-hand side, i.e. above the diagonal) and nucleotide (below the diagonal) substitution rates (per 100 residues) in a highly conserved region of open
reading frame (ORF) 1b of the pol gene of 11 coronaviruses

          HEV    BCV   OC43    MHV   SDAV    IBV    TCV   FIPV   TGEV    CCV   229E
HEV         -   0.98   2.99   8.27   7.19  38.38  40.56  44.90  45.48  44.32  48.45
BCV      2.54      -   1.98   8.27   7.19  38.38  40.56  43.75  44.32  43.18  48.45
OC43     3.21   3.33      -   9.01   7.91  38.38  40.56  44.90  44.90  44.32  48.45
MHV     15.89  15.09  15.35      -   0.98  38.92  41.11  47.85  47.85  47.85  50.28
SDAV    14.16  13.51  14.03   4.12      -  37.84  40.01  46.65  46.65  46.65  49.05
IBV     49.50  49.08  48.25  49.50  47.01      -   3.00  45.68  46.27  45.68  47.46
TCV     49.92  49.92  48.66  50.13  48.04   7.19      -  47.46  48.06  47.46  49.28
FIPV    50.11  49.06  49.90  49.69  47.82  46.40  48.25      -   0.33   0.33  24.03
TGEV    49.06  48.85  48.65  49.27  48.03  47.63  48.87   3.33      -   0.65  24.48
CCV     48.23  47.82  48.03  49.27  47.00  47.22  49.92   4.01   4.58      -  24.48
229E    49.48  48.44  48.44  50.54  49.69  48.87  48.04  35.37  35.72  35.54      -


Fig. 1. Deduced amino acid sequence of the polymerase motif region from open reading frame (ORF) 1b of the pol gene of 11
coronaviruses. The mouse hepatitis virus (MHV), infectious bronchitis virus (IBV) and 229E sequences are derived from published
sequences (see text). The first amino acid in this figure corresponds to amino acid 466 of the IBV ORF 1b (Lee et al., 1991). Amino
acids 79 through 307 correspond to 228 of the 258 amino acids representing the conserved polymerase motif common to
coronaviruses, toroviruses and arteriviruses (see Fig. 5 in den Boon et al., 1991). The highly conserved SDD or GDD polymerase
motif (Poch et al., 1989) is identified by asterisks. Capitalized letters indicate amino acids which are conserved in all 11 sequences.

           1                    21
HEV        fNKFGKArLYYEalsfEEQD eiYayTKrNVLPTlTQMNLK
BCV        fNKFGKArLYYEalsfEEQD eiYayTKrNVLPTlTQMNLK
OC43       fNKFGKArLYYEalsfEEQD eiYayTKrNVLPTlTQMNLK
MHV        fNKFGKArLYYEalsfEEQD eiYayTKrNVLPTlTQMNLK
SDAV       fNKFGKArLYYEalsfEEQD eiYayTKrNVLPTlTQMNLK
IBV        fNKFGKArLYYEmsl.EEQD qlFeiTKkNVLPTiTQMNLK
TCV        fNKFGKArLYYEmsl.EEQD qlFesTKkNVLPTiTQMNLK
FIPV       INKFGKArLYYEtlsyEEQD alFalTKrNVLPTmTQMNLK
TGEV       INKFGKArLYYEtlsyEEQD alFalTKrNVLPTmTQMNLK
CCV        INKFGKArLYYEtlsyEEQD alFalTKrNVLPTmTQMNLK
229E       lNKFGKAgLYYEsisyEEQD aiFslTKrNILPTmTQLNLK
Consensus  -NKFGKA-LYYE----EEQD --F--TK-NVLPT-TQMNLK

           41                   61
HEV        YAISaKnRARTVaGVSiLsT MTgRmFHQKcLKSIaaTRgv
BCV        YAISaKnRARTVaGVSiLsT MTgRmFHQKcLKSIaaTRgv
OC43       YAISaKnRARTVaGVSiLsT MTgRmFHQKcLKSIaaTRgv
MHV        YAISaKnRARTVaGVSiLsT MTgRmFHQKcLKSIaaTRgv
SDAV       YAISaKnRARTVaGVSiLsT MTgRmFHQKcLKSIaaTRgv
IBV        YAISaKnRARTVaGVSiLsT MTnRqFHQKiLKSIvnTRna
TCV        YAISaKnRARTVaGVSiLsT MTnRqFHQKiLKSIvnTRna
FIPV       YAISgKaRARTVgGVSlLsT MTtRqYHQKhLKSIaaTRna
TGEV       YAISgKaRARTVgGVSlLsT MTtRqYHQKhLKSIaaTRna
CCV        YAISgKaRARTVgGVSlLsT MTtRqYHQKhLKSIaaTRna
229E       YAISgKeRARTVgGVSlLaT MTtRqFHQKcLKSIvaTRna
Consensus  YAIS-K-RARTV-GVS-L-T MT-R-FHQK-LKSI--TR--

           81                   101
HEV        pVVIGtTKFYGGWDdMLrrL ikdVDnpvLMGWDYPKCDRA
BCV        pVVIGtTKFYGGWDdMLrrL ikdVDnpvLMGWDYPKCDRA
OC43       pVVIGtTKFYGGWDdMLrrL ikdVDnpvLMGWDYPKCDRA
MHV        pVVIGtTKFYGGWDdMLrrL ikdVDspvLMGWDYPKCDRA
SDAV       pVVIGtTKFYGGWDdMLrrL ikdVDspvLMGWDYPKCDRA
IBV        sVVIGtTKFYGGWDnMLrnL iqgVEdpiLMGWDYPKCDRA
TCV        pVVIGtTKFYGGWDnMLrnL iqgVEdpiLMGWDYPKCDRA
FIPV       tVVIGsTKFYGGWDnMLknL mrdVDngcLMGWDYPKCDRA
TGEV       tVVIGsTKFYGGWDnMLknL mrdVDngcLMGWDYPKCDRA
CCV        tVVIGsTKFYGGWDnMLknL mrdVDngcLMGWDYPKCDRA
229E       tVVIGtTKFYGGWDnMLknL madVDdpkLMGWDYPKCDRA
Consensus  -VVIG-TKFYGGWD-ML--L ---VD---LMGWDYPKCDRA

           121                  141
HEV        MPnilRivssLVLarKHeaC CsqsdrfYRLaNEcAQVLsE
BCV        MPnilRivssLVLarKHeaC CsqsdrfYRLaNEcAQVLsE
OC43       MPnllRivssLVLarKHetC CsqrtrfYRLaNEcAQVLsE
MHV        MPnilRivssLVLarKHdsC CshtdrfYRLaNEcAQVLgE
SDAV       MPnilRivssLVLarKHdsC CshtdrfYRLaNEcAQVLsE
IBV        MPnllRiaasLVLarKHtnC CswseriYRLyNEcAQVLsE
TCV        MPnllRitasLVLarKHtnC CtwseriYRLyNEcAQVLsE
FIPV       LPnmiRmasaMILgsKHvgC CthsdrfYRLsNElAQVLtE
TGEV       LPnmiRmasaMILgsKHvgC CthndrfYRLsNElAQVLtE
CCV        LPnmiRmasaMILgsKHvgC CthsdrfYRLsNElAQVLtE
229E       MPsmiRmlsaMILgsKHvtC CtasakfYRLsNElAQVLtE
Consensus  MP---R----LVL--KH--C C------YRL-NE-AQVL-E

           161                  181
HEV        iVmcgGcyYvKPGGTsSGDa TTAFANSvFNIcQAvSaNVc
BCV        iVmcgGcyYvKPGGTsSGDa TTAFANSvFNIcQAvSaNVc
OC43       iVmcgGcyYvKPGGTsSGDa TTAFANSvFNIcQAvSaNVc
MHV        iVmcgGcyYvKPGGTsSGDa TTAFANSvFNIcQAvSaNVc
SDAV       iVmcgGcyYvKPGGTsSGDa TTAFANSvFNIcQAvSaNVc
IBV        tViatGgiYvKPGGTsSGDa TTAYANSvFNIiQAtSaNVa
TCV        tVlatGgiYvKPGGTsSGDa TTAYANSvFNIiQAtSaNVa
FIPV       vVhctGgfYfKPGGTtSGDg TTAYANSaFNIfQAvSaNVn
TGEV       vVhctGgfYfKPGGTtSGDg TTAYANSaFNIfQAvSaNVn
CCV        vVhctGgfYfKPGGTtSGDg TTAYANSaFNIfQAvSaNVn
229E       vVysnGgfYfKPGGTtSGDa TTAYANSvFNIfQAvSsNIn
Consensus  -V---G--Y-KPGGT-SGD- TTAYANS-FNI-QA-S-NV-

           201                  221
HEV        alMscngnkiedlsIralQk rlYshvYRndmvDstFVteY
BCV        alMscngnkiedlsIralQk rlYshvYRsdmvDstFVteY
OC43       alMscngnkiedlsIralQk rlYshvYRsdkvDstFVteY
MHV        slMacnghkiedlsIrelQk rlYsnvYRadhvDpaFVseY
SDAV       slMacnghkiedlsIrelQk rlYsnvYRadhvDpaFVseY
IBV        rlLsvitrdivydnlkslQy elYqqvYRrvnfDpaFVekF
TCV        rlLsvitrdivyddlkslQy elYqqvYRrvnfDpaFVekF
FIPV       klLgvdsnacnnvtVksiQr kiYdncYRsssiDeeFVveY
TGEV       klLgvdsnacnnvtVksiQr kiYdncYRsssiDeeFVveY
CCV        klLgvdsnacnnvtVksiQr kiYdncYRsssiDeeFVveY
229E       cvLsvnssncnnfnVkklQr qlYdncYRnsnvDesFVddF
Consensus  --L-----------I---Q- --Y---YR----D--FV--Y

           241                  261
                        ***
HEV        YeFLnKhFSMmILsDDgVVC YdsdYAskGylAnlsaFqqv
BCV        YeFLnKhFSMmILsDDgVVC YnsdYAskGylAnlsaFqqv
OC43       YeFLnKhFSMmILsDDgVVC YnsdYAskGylAnlsaFqqv
MHV        YeFLnKhFSMiILsDDgVVC YnseFAskGylAnlsdFqqv
SDAV       YeFLnKhFSMmILsDDgVVC YnseFAskGylAnlsaFqqv
IBV        YsYLcKnFSLmILsDDgVVC YnntLAkqGIVAdlsgFrev
TCV        YsYLcKnFSLmIFaDDgVVC YnntLAkqGIVAdlsgFrei
FIPV       FsYLrKhFSMmILsDDgVVC YnkdYAdlGyVAdlnaFkat
TGEV       FsYLrKhFSMmILsDDgVVC YnkdYAdlGyVAdlnaFkat
CCV        FsYLrKhFSMmILsDDgVVC YnkdYAdlGyVAdlnaFkat
229E       YgYLqKhFSMmILsDDsVVC YnktYAglGylAdlsaFkat
Consensus  Y-YL-K-FSM-IL-DD-VVC Y---YA--G-IA-I--F---

           281                  301
HEV        LYYQNnVFMsesKCWvEnDi nkGPHEF
BCV        LYYQNnVFMsesKCWvEnDi nnGPHEF
OC43       LYYQNnVFMsesKCWvEhDi nnGPHEF
MHV        LYYQNnVFMseaKCWvEtDi ekGPHEF
SDAV       LYYQNnVFMseaKCWvEtDi ekGPHEF
IBV        LYYQNnVFMadsKCWvEpDl ekGPHEF
TCV        LYYQNnVYMadsKCWvEpDl ekGPHEF
FIPV       LYYQNnVFMstsKCWvEpDl svGPHEF
TGEV       LYYQNnVFMstsKCWvEpDl svGPHEF
CCV        LYYQNnVFMstsKCWvEpDl nvGPHEF
229E       LYYQNgVFMstaKCWtEeDl siGPHEF
Consensus  LYYQN-VFM---KCW-E-D- --GPHEF


Fig. 2. Unrooted dendrogram showing Kimura's distances (represented by branch lengths) for cDNA sequences from a 922
nucleotide region of open reading frame (ORF) 1b of the pol gene of 11 coronaviruses (see text for details). Numbers
represent the results of a bootstrap analysis and indicate the number of times out of 100 iterations that these
branch points were identified. Sequences for the eight coronaviruses reported here are available from GenBank
under the following accession numbers: bovine coronavirus (BCV), AF124985; canine coronavirus (CCV), AF124986;
feline infectious peritonitis virus (FIPV), AF124987; hemagglutinating encephalomyelitis virus of swine (HEV),
AF124988; OC43, AF124989; sialodacryoadenitis virus of rats (SDAV), AF124990; turkey coronavirus (TCV),
AF124991; transmissible gastroenteritis virus (TGEV), AF124992.

ciation with a variety of human and animal diseases, but further characterization and definitive
identification of these agents as coronaviruses has been difficult (Resta et al., 1985; Myint, 1995;
Guy et al., 1997).

For these reasons, a highly conserved 922 nucleotide region in ORF 1b of the pol gene of eight
coronaviruses was recently cloned and sequenced using consensus PCR primers. This region has
previously been completely sequenced for two group 1 viruses, human coronavirus (HCV)-229E
(Herold et al., 1993) and transmissible gastroen-

Fig. 3. cDNA sequences for a subregion of the 922 nucleotides from open reading frame (ORF) 1b of the pol gene used for
the analysis shown in Fig. 2. (Nucleotide number 1 of this 922 nucleotide-long region corresponds to nucleotide number
13 853 in the infectious bronchitis virus (IBV) pol sequence (Boursnell et al., 1987).) This figure shows nucleotides number
101 through 400. The regions targeted by the two degenerate primers (CV2Bp and CV4Bm, see text for sequence) used in
the consensus polymerase chain reaction (PCR) assay for the genus Coronavirus are underlined (shown here as rows of "="
labelled CV2Bp and CV4Bm beneath the consensus line).

           101                                                150
HEV        cTtACTCAaa TgAATtTgAA ATAtGCtATt agtGccAAga ataGaGCcCG
BCV        cTtACTCAaa TgAATtTgAA ATAtGCtATt agtGctAAga ataGaGCcCG
OC43       cTtACTCAaa TgAATtTgAA ATAtGCtATt agtGctAAga ataGaGCcCG
MHV        cTaACTCAaa TgAATcTtAA ATAtGCtATt agtGctAAga ataGgGCcCG
SDAV       tTaACTCAaa TgAATcTtAA ATAtGCtATt agtGctAAga ataGaGCcCG
IBV        aTaACTCAaa TgAATtTaAA ATAtGCcATa tccGcgAAaa ataGaGCgCG
TCV        aTaACTCAaa TgAATtTaAA ATAtGCcATa tccGcgAAaa ataGaGCgCG
FIPV       aTgACTCAaa TgAATtTgAA ATAtGCtATt tctGgtAAgg caaGaGCtCG
TGEV       aTgACTCAaa TgAATtTgAA ATAcGCtATt tctGgtAAgg caaGaGCtCG
CCV        aTgACTCAga TgAATtTgAA ATAtGCtATt tctGgaAAgg ctaGaGCtCG
229E       aTgACTCAgt TaAATcTtAA ATAcGCcATa tctGgtAAgg aacGcGCaCG
Consensus  -T-ACTCA-- T-AAT-T-AA ATA-GC-AT- ---G--AA-- ---G-GC-CG
CV2Bp         ======= ========== ======

           151                                                200
HEV        cACtGTtGct GGtGTtTCca TacTtagtAC tATGACTggc AGaatgTttC
BCV        cACtGTtGct GGtGTtTCca TacTcagtAC tATGACTggc AGaatgTttC
OC43       cACtGTtGct GGtGTtTCca TacTtagtAC tATGACTggc AGaatgTttC
MHV        cACcGTtGct GGtGTcTCta TtcTcagtAC tATGACTggc AGaatgTttC
SDAV       cACtGTtGct GGtGTcTCca TccTtagtAC tATGACTggc AGaatgTttC
IBV        tACaGTgGca GGtGTgTCta TccTttctAC tATGACTaat AGgcagTttC
TCV        tACaGTgGca GGtGTgTCta TccTttccAC tATGACTaat AGgcaaTttC
FIPV       tACaGTaGga GGaGTtTCac TtcTttctAC cATGACTacg AGacaaTacC
TGEV       tACaGTaGga GGaGTtTCac TtcTttctAC cATGACTacg AGacaaTatC
CCV        tACaGTaGga GGaGTtTCac TtcTttctAC cATGACTacg AGacaaTacC
229E       tACaGTgGgt GGcGTcTCtt TatTagctAC tATGACTaca AGacagTttC
Consensus  -AC-GT-G-- GG-GT-TC-- T--T----AC -ATGACT--- AG----T--C

           201                                                250
HEV        AtCAaAAatg ctTgAAaagt ATaGcagctA CacGtggcGt tcCtGTgGTt
BCV        AtCAaAAatg ttTgAAaagt ATaGcagctA CacGtggtGt tcCtGTtGTt
OC43       AtCAaAAatg ttTgAAaagt ATaGcagctA CacGtggtGt tcCtGTaGTt
MHV        AtCAaAAgtg ttTaAAgagt ATaGcagctA CtcGtggtGt tcCtGTaGTt
SDAV       AtCAaAAgtg ttTaAAgagt ATaGcagctA CtcGtggtGt gcCtGTaGTt
IBV        AtCAgAAgat tcTtAAgtct ATaGtcaacA CtaGaaatGc ttCtGTaGTt
TCV        AtCAgAAgat tcTtAAgtct ATaGtcaatA CtaGaaatGc tcCtGTaGTt
FIPV       AtCAgAAgca ttTgAAgtca ATtGctgcaA CacGcaatGc caCtGTtGTc
TGEV       AtCAgAAgca ttTgAAgtca ATtGctgcaA CacGcaatGc taCtGTgGTc
CCV        AcCAgAAgca ttTgAAgtca ATtGctgcaA CacGcaatGc caCtGTgGTt
229E       AtCAgAAatg tcTgAAatcc ATaGtagctA CcaGaaatGc caCcGTtGTt
Consensus  A-CA-AA--- --T-AA---- AT-G-----A C--G----G- --C-GT-GT-

           251                                                300
HEV        ATaGGcaCcA CtAAaTTtTA tGGcGGcTGG GAtgAtATGt TacgccgccT
BCV        ATaGGcaCcA CtAAgTTtTA tGGcGGcTGG GAtgAtATGt TacgtcgccT
OC43       ATaGGcaCcA CtAAaTTtTA tGGtGGcTGG GAtgAtATGt TacgccgccT
MHV        ATaGGcaCcA CgAAgTTcTA cGGcGGtTGG GAtgAtATGt TacgccgccT
SDAV       ATaGGcaCcA CgAAgTTtTA tGGcGGtTGG GAtgAcATGt TacgccgccT
IBV        ATtGGaaCaA CcAAgTTtTA tGGcGGtTGG GAcaAcATGt TgagaaaccT
TCV        ATtGGgaCaA CcAAgTTtTA tGGcGGtTGG GAcaAtATGt TgaggaaccT
FIPV       ATtGGttCaA CcAAgTTtTA tGGtGGtTGG GAcaAcATGc TtaaaaattT
TGEV       ATtGGttCaA CcAAgTTtTA tGGtGGtTGG GAcaAtATGc TtaaaaattT
CCV        ATtGGctCaA CcAAgTTtTA tGGtGGtTGG GAtaAcATGc TtaaaaattT
229E       ATcGGcaCtA CcAAgTTtTA tGGcGGgTGG GAtaAtATGt TaaagaaccT
Consensus  AT-GG--C-A C-AA-TT-TA -GG-GG-TGG GA--A-ATG- T--------T

           301                                                350
HEV        tATtaaaGaT GTTGAtaatc ctgtacTtAT GGGtTGGGAt TATCCaAAgT
BCV        tATtaaaGaT GTTGAtaatc ctgtacTtAT GGGtTGGGAt TATCCtAAgT
OC43       tATtaaaGaT GTTGAcaatc ctgtacTtAT GGGtTGGGAt TATCCtAAgT
MHV        tATtaaaGaT GTTGAtagtc ctgtacTcAT GGGtTGGGAc TATCCtAAaT
SDAV       tATtaaaGaT GTTGAtagtc ctgtacTtAT GGGtTGGGAc TATCCtAAgT
IBV        gATtcagGgT GTTGAagacc caattcTtAT GGGtTGGGAt TATCCtAAgT
TCV        gATtcaaGgT GTTGAagacc ctattcTtAT GGGgTGGGAt TATCCtAAgT
FIPV       gATgcgtGaT GTTGAtaacg gttgttTgAT GGGgTGGGAc TATCCtAAgT
TGEV       aATgcgtGaT GTTGAtaatg gttgttTgAT GGGaTGGGAc TATCCtAAgT
CCV        aATgcgtGaT GTTGAtaatg gttgttTgAT GGGaTGGGAc TATCCtAAgT
229E       gATggccGaT GTTGAtgatc ctaaatTgAT GGGaTGGGAc TATCCtAAgT
Consensus  -AT----G-T GTTGA----- ------T-AT GGG-TGGGA- TATCC-AA-T
CV4Bm                                           ====== ==========

           351                                                400
HEV        GTGAtcGtGC taTgCCaaac aTacTacGtA Ttgttagtag tcTggTatTg
BCV        GTGAtcGtGC taTgCCaaac aTacTacGtA Ttgttagtag tcTggTctTg
OC43       GTGAtcGtGC taTgCCaaac cTacTacGtA Ttgttagtag ttTggTatTa
MHV        GTGAtcGtGC taTgCCaaac aTacTgcGtA Tcgttagtag ttTggTgtTa
SDAV       GTGAtcGtGC taTgCCaaac aTacTacGtA Ttgttagtag ttTggTgtTa
IBV        GTGAtaGaGC aaTgCCtaat tTgtTgcGtA Tagcagcatc ctTagTacTt
TCV        GTGAtaGaGC aaTgCCaaat tTgcTacGtA Taacagcatc ttTggTacTt
FIPV       GTGAccGtGC ttTaCCtaat aTgaTtaGaA Tggcttctgc caTgaTatTg
TGEV       GTGAccGtGC ttTaCCtaat aTgaTtaGaA Tggcttctgc caTgaTatTa
CCV        GTGAccGcGC ttTaCCtaat aTgaTcaGaA Tggcatctgc caTgaTatTa
229E       GTGAtaGaGC taTgCCctca aTgaTtcGtA Tgttgtcggc taTgaTctTa
Consensus  GTGA--G-GC --T-CC---- -T--T--G-A T--------- --T--T--T-
CV4Bm      ====




C.B. Stephensen et al. /Virus Research 60 (1999) 181-189 




[Gel image not reproduced. Lane labels, in the order printed: 123 bp, OC43, BCV, MHV, SDAV, 229E, FIPV, TGEV, CCV, 123 bp; 123 bp, IBV, TCV, RT neg, pOC43, PCR neg, 123 bp.]


Fig. 4. Polymerase chain reaction (PCR) products for ten coronaviruses [OC43, bovine coronavirus (BCV), mouse
hepatitis virus (MHV), sialodacryoadenitis virus of rats (SDAV), 229E, feline infectious peritonitis virus (FIPV),
transmissible gastroenteritis virus (TGEV), canine coronavirus (CCV), infectious bronchitis virus (IBV), turkey
coronavirus (TCV)] using the consensus PCR primers (2Bp and 4Bm, see text for sequence) for the genus Coronavirus.
Twenty µl of reaction product were run on a 4% agarose gel (NuSieve 3:1, FMC BioProducts, Rockland, ME) and stained
with 1 µg/ml ethidium bromide. Also included on the gel were: reaction product from PCR using 1 pg of plasmid
containing target sequence from human coronavirus (HCV)-OC43 as positive control (pOC43); reaction products from
negative control samples (water only) which were carried through both the reverse transcriptase (RT) and PCR
steps (RT neg) or the PCR step alone (PCR neg); 1 µg of 123 bp molecular size standards (Bethesda Research Labs,
Bethesda, MD).


teritis virus (TGEV) of swine (Eleouet et al., 1995), two different isolates of a single group 2
virus, mouse hepatitis virus (MHV) (Pachuk et al., 1989; Lee et al., 1991), and the single group 3
virus, infectious bronchitis virus (IBV) of chickens (Boursnell et al., 1987). Degenerate
oligonucleotide primers were selected by identifying the most conserved regions from the published IBV
and MHV pol sequences (Boursnell et al., 1987; Lee et al., 1991). These primers were used to
derive clones from three group 1 viruses: feline infectious peritonitis virus (FIPV; UCD2 strain
provided by Nils Pedersen, University of California, Davis), TGEV of swine (provided by David
Brian, University of Tennessee, Knoxville) and canine coronavirus (CCV; 1-71 strain from the
American Type Culture Collection (ATCC), catalog no. VR-809, Rockville, MD); and from five group 2
viruses: hemagglutinating encephalomyelitis virus of swine (HEV; ATCC catalog no. VR-741),
bovine coronavirus (BCV) (provided by David Brian), HCV-OC43 (provided by Ortwin Schmidt,
University of Oklahoma School of Osteopathic Medicine, Tulsa), sialodacryoadenitis virus of rats
(SDAV; provided by Trenton Schoeb, University of Florida, GA, from a stock originally derived
from ATCC) and turkey enteric coronavirus (TCV) obtained directly from ATCC (ATCC VR-911).
Two genome-sense primers were used in the PCR reactions. The 5'-most primer was 8p,
5'-TATGA(GA)GG(TC)GG(GC)TGTATACC-3', the 5' end of which was 52 nucleotides upstream
from the second genome-sense primer 1Ap, 5'-GATAAGAGTGC(TA)GGCTA(TC)CC-3'. One
antigenome-sense primer, 7m, 5'-ACTAGCATTGT(AG)TGTTG(AT)GAACA-3', was used for first-strand
cDNA synthesis and for the subsequent PCR. The region amplified by these primers (1Ap/7m)
(including the primer sequences) corresponds to nucleotides 13 833 through 14 797 of IBV
(Boursnell et al., 1987) and 15 118 through 16 082 of MHV (Lee et al., 1991). The 1Ap/7m primer
combination, which produced a 965 bp product, was used for all of the indicated viruses except for
HEV and TCV. For these viruses, the 8p/7m primer combination was used, which produced a
1013 bp product. The 922 nucleotides internal to the 1Ap/7m primers (919 in the case of IBV and
TCV, which contain a three nucleotide deletion) were sequenced and analysed. These primers were
also used in an attempt to characterize the putative rabbit coronavirus (RbCV), which was first
described from rabbits with pleural effusion disease and has tentatively been considered a
coronavirus (Small et al., 1979). However, this classification is not definitive (Siddell, 1995) and
the virus is poorly characterized. The primer pairs (1Ap/7m, 8p/7m) did not amplify any identifiable
sequences from a standard infectious serum (ATCC VR-920) derived from a rabbit with
pleural effusion disease.
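
The product and insert sizes quoted above follow from the published coordinates and the primer lengths. The short Python sketch below is an illustration only and was not part of the original analysis. It rewrites the bracketed degenerate positions with standard IUPAC one-letter codes (R = A/G, Y = C/T, S = G/C, W = A/T), an encoding assumed here for convenience; the names PRIMERS and span_length are hypothetical.

```python
# Primers from the text, with each bracketed degenerate position written as a
# single IUPAC code so that len() counts nucleotide positions (an assumption
# made for this sketch; the text writes them as (GA), (TC), and so on).
PRIMERS = {
    "8p": "TATGARGGYGGSTGTATACC",
    "1Ap": "GATAAGAGTGCWGGCTAYCC",
    "7m": "ACTAGCATTGTRTGTTGWGAACA",
}


def span_length(first: int, last: int) -> int:
    """Number of nucleotides in an inclusive coordinate range."""
    return last - first + 1


# 1Ap/7m product, primers included, on the two published genomes.
ibv_product = span_length(13_833, 14_797)
mhv_product = span_length(15_118, 16_082)

# Region internal to the primers: product minus both primer lengths.
internal_region = ibv_product - len(PRIMERS["1Ap"]) - len(PRIMERS["7m"])

# First internal nucleotide in IBV coordinates (just downstream of 1Ap).
first_internal_ibv = 13_833 + len(PRIMERS["1Ap"])

print(ibv_product, mhv_product, internal_region, first_internal_ibv)
```

Under these assumptions the arithmetic reproduces the 965 bp product, the 922 nucleotide internal region, and the IBV nucleotide 13 853 given in the legend of Fig. 3 as position 1 of that region.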

Viral RNA was prepared (Chomczynski and Sacchi, 1987) from tissue culture supernatants or
cellular extracts. First strand cDNA was synthesized using avian myeloblastosis virus reverse
transcriptase (RT, Promega, Madison, WI) or Moloney murine leukemia virus RT (Superscript I
or II, Bethesda Research Laboratories, Bethesda, MD). PCR was performed with 0.25 µM primers,
from 0.025 to 0.04 U/µl Taq polymerase (Promega), manufacturer's buffer containing 1.5
mM Mg, and deoxynucleotide triphosphates (0.1 mM each). PCR profiles involved an initial
denaturation for 1 min at 98°C followed by 32-40 cycles of annealing at 45°C for 1 to 2 min,
extension at 72°C for 1 min, and melting at 94°C for 1 min. In some cases, the final 20 cycles were
performed using a 50°C annealing temperature. Amplification products were subcloned into the
pCR1000 or 2000 vector using the TA cloning system (Invitrogen, San Diego, CA). Inserts were
sequenced completely in both directions with Sequenase 2.0 (US Biochemical, Cleveland, OH),
plasmid region primers, the PCR primers, and additional sequencing primers (not shown).
Sequence alignment was performed using the Lineup and Pileup programs from the Genetics Computer
Group software (Devereux et al., 1984).

The deduced amino acid sequences for this region of ORF 1b of the pol gene for the 11
coronaviruses align precisely (Fig. 1) and correspond to the highly conserved region surrounding
the SDD or GDD polymerase motif common to viral RNA-dependent polymerases (Poch et al.,
1989). The only gaps in the alignment are attributable to a single amino acid deletion at
position 16 in both IBV and TCV. All coronaviruses show the SDD sequence at the putative active site
of the polymerase, except TCV, which, unusually for a viral RNA-dependent RNA polymerase, has
an ADD sequence. The amino acid and nucleotide substitution rates among these 11
viruses are shown in Table 1 and reveal patterns which are similar to those described by the
three groups, with the single exception that the TCV sequence is much more similar to IBV than
to any other coronavirus. For example, within the group 2 cluster of five viruses the maximum
substitution frequency is 16/100 nucleotides (comparing MHV to HEV) while among the four group 1
viruses the frequency among TGEV, FIPV and CCV is < 5/100 nucleotides. However, 229E
differs from these three by an average of 36 substitutions/100 nucleotides, which is consistent with the
weaker antigenic relationship of 229E to these viruses (Sanchez et al., 1990). The TCV sequence
is very similar to IBV, showing a substitution frequency of only 7.2/100 nucleotides, clearly
suggesting that these viruses should fall within the same group.

To further characterize the phylogenetic relationships among these viruses, a dendrogram was
created with PAUP (version 3.0) using the maximum parsimony method. A branch and bound
algorithm was used to identify the single most parsimonious tree. Only one tree was identified.
The three nucleotides missing in the IBV and TCV sequences (which represent a single amino
acid deletion) were each treated as a separate character state rather than as missing data. The
resulting unrooted tree is shown in Fig. 2. The consistency index of the tree was 0.818 and the
rescaled consistency index was 0.711. Bootstrap analysis was also performed and the resulting
values are shown at branch points in the figure. An identical tree and essentially identical
bootstrap values were also derived using the Kimura two-parameter method for calculating distances
and the neighbor-joining method to construct the tree (using the Clustal W program). Again, this
analysis reveals that the published IBV sequence and the TCV sequence presented here are very closely
related. In addition, the three group 1 viruses FIPV, TGEV and CCV are found on a common
branch with HCV-229E being more distantly related. The group 2 viruses fall into two groupings,
with SDAV and MHV being closely related to one another and the remaining three viruses in
this group (HCV-OC43, BCV and HEV) forming a separate branch.
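
For readers unfamiliar with the Kimura two-parameter distance mentioned above, the sketch below shows the standard formula, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions. It is an illustration only: the function name kimura_2p is hypothetical, positions with gaps or ambiguous bases are skipped by assumption, and the code does not reproduce the Clustal W implementation or the neighbor-joining step used for Fig. 2.

```python
import math

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def kimura_2p(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned nucleotide sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (assumption of this sketch)
        sites += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the two-parameter correction")
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Nucleotides 101-150 of HEV and IBV, copied from Fig. 3.
HEV_FRAGMENT = "cTtACTCAaaTgAATtTgAAATAtGCtATtagtGccAAgaataGaGCcCG"
IBV_FRAGMENT = "aTaACTCAaaTgAATtTaAAATAtGCcATatccGcgAAaaataGaGCgCG"

print(round(kimura_2p(HEV_FRAGMENT, IBV_FRAGMENT), 4))
```

Because the correction allows for multiple substitutions at a site, the distance is never smaller than the uncorrected proportion of differing sites, and the gap between the two grows as sequences diverge.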





This phylogenetic analysis conforms closely to results from antigenic studies of these
coronaviruses, with the single exception that the analysis indicates that TCV and IBV are closely related
viruses. Coronaviruses have traditionally been divided into three groups (Sturman and Holmes,
1983; Siddell, 1995), including two groups of primarily mammalian coronaviruses (groups 1 and 2,
although TCV was recently included in group 2, Siddell, 1995) and a separate, single-member
group for the avian coronavirus IBV (group 3). The antigenic characterization of the second avian
coronavirus, TCV, has been controversial. Serologic studies (Dea and Tijssen, 1989; Dea et al.,
1990) and sequence analysis of the N and M genes (Verbeek and Tijssen, 1991) from a cell
culture-adapted clone of the Minnesota strain of TCV indicate that TCV is closely related to the group 2
mammalian coronaviruses, particularly BCV and HCV-OC43. However, recent serologic studies
with both polyclonal and monoclonal antibodies (Guy et al., 1997) indicate that the Minnesota
strain of TCV, as well as additional field isolates of TCV, are close antigenic relatives of IBV. The
data agree with this latter conclusion. Since the pol gene product is not involved in the
determination of antigenic cross-reactivity among viruses, the data do not directly address the discrepancy
between the results of Guy et al. (1997) and Dea et al. (1990), but do indicate that further work is
necessary to resolve the contradictory findings with regard to the characterization of TCV.

A goal of the sequence analysis described above was to identify conserved regions which could be
targeted for the development of a consensus PCR assay for the genus Coronavirus. Since neither
primer pair used in cloning these pol gene regions (1Ap/7m or 8p/7m) detected all 11 coronaviruses
used in this study, the 922 (919 in the case of IBV and TCV) nucleotide region internal to the
1Ap/7m primers was compared to identify regions with greater sequence identity. As shown in Fig.
3, two regions were selected to serve as targets for two degenerate oligonucleotide primers: primer
2Bp, 5'-ACTCA(A/G)(A/T)T(A/G)AAT(T/C)TNAAATA(T/C)GC-3'; and primer 4Bm,
5'-TCACA(C/T)TT(A/T)GGATA(G/A)TCCCA-3'. After testing different reaction conditions, a
protocol was selected in which the RT and PCR portions of the assay were performed essentially
as described above, using the 4Bm oligonucleotide to prime cDNA synthesis. Annealing
conditions during the PCR assay were also modified slightly from those described above, namely: in
the first five cycles the annealing temperature was 40°C (2 min), followed by 35 cycles at 50°C (1.5
min). The sensitivity of this protocol was tested using a plasmid containing the 965 bp
HCV-OC43 pol sequence. The limit of detection for this plasmid on an ethidium bromide-stained gel was
6000 plasmid copies (data not shown). This assay was then tested on representative coronaviruses
from each group. As shown in Fig. 4, these primers amplified the expected 251 bp region in
four group 1 viruses (229E, FIPV, TGEV, CCV), four group 2 viruses (OC43, BCV, MHV,
SDAV), the single, currently recognized, group 3 virus (IBV), and TCV, which is currently placed
in group 2. In addition, these primers detected a fifth group 2 virus, HEV (data not shown). After
repeated attempts, these primers did not detect the pol target sequence in infectious serum from a
rabbit with pleural effusion disease (containing 4 x 10^5 rabbit infectious units; ATCC VR-920).
Thus this assay will detect all ten of the well-characterized coronaviruses studied here, will also
detect TCV, but will not detect the putative RbCV. This result suggests that the putative
RbCV is not a member of the genus Coronavirus. However, slight variations in the target sequences
for these primers, or a lack of sensitivity of this assay, could also explain this negative result.
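
The relationship between the two degenerate primers and the 251 bp product can be checked against the sequences in Fig. 3. The Python sketch below is an illustration under stated assumptions, not the authors' procedure. The primers are rewritten with IUPAC codes (R = A/G, Y = C/T, W = A/T, N = any base), the helper names are hypothetical, the template is the HEV row of Fig. 3 (nucleotides 101-400), and a sequence match says nothing about whether amplification succeeds at the bench.

```python
from typing import Optional

# IUPAC codes needed for these two primers and their complements.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYWSN", "TGCAYRWSN")

PRIMER_2BP = "ACTCARWTRAATYTNAAATAYGC"   # genome sense
PRIMER_4BM = "TCACAYTTWGGATARTCCCA"      # antigenome sense

# HEV, nucleotides 101-400 of the 922 nucleotide region, copied from Fig. 3.
HEV_101_400 = (
    "cTtACTCAaaTgAATtTgAAATAtGCtATtagtGccAAgaataGaGCcCG"
    "cACtGTtGctGGtGTtTCcaTacTtagtACtATGACTggcAGaatgTttC"
    "AtCAaAAatgctTgAAaagtATaGcagctACacGtggcGttcCtGTgGTt"
    "ATaGGcaCcACtAAaTTtTAtGGcGGcTGGGAtgAtATGtTacgccgccT"
    "tATtaaaGaTGTTGAtaatcctgtacTtATGGGtTGGGAtTATCCaAAgT"
    "GTGAtcGtGCtaTgCCaaacaTacTacGtATtgttagtagtcTggTatTg"
)


def reverse_complement(primer: str) -> str:
    """Reverse complement of a primer written with IUPAC codes."""
    return primer.translate(COMPLEMENT)[::-1]


def find_site(primer: str, template: str) -> int:
    """0-based index of the first site matching a degenerate primer, or -1."""
    template = template.upper()
    for start in range(len(template) - len(primer) + 1):
        if all(template[start + i] in IUPAC[code] for i, code in enumerate(primer)):
            return start
    return -1


def amplicon_length(forward: str, reverse: str, template: str) -> Optional[int]:
    """Length of the product spanning both primer sites, primers included."""
    start = find_site(forward, template)
    end = find_site(reverse_complement(reverse), template)
    if start < 0 or end < 0:
        return None
    return end + len(reverse) - start


print(amplicon_length(PRIMER_2BP, PRIMER_4BM, HEV_101_400))
```

On this template the 2Bp site begins at nucleotide 104 of the region and the 4Bm site ends at nucleotide 354, which gives the 251 bp product reported in the text. The same helpers can be run on the other rows of Fig. 3 to look for mismatches at the primer sites.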

Coronaviruses infect a variety of animal hosts and many uncharacterized coronaviruses have
been implicated in a variety of diseases, particularly enteric (Resta et al., 1985; Guy et al., 1997)
and respiratory (Myint, 1995) infections. The consensus PCR approach described here has
already provided novel information on the identity of one little-studied coronavirus (TCV),
suggesting that it should be classified with IBV in group 3. In the future, this consensus PCR approach
should prove useful in identifying and characterizing additional members of the genus Coronavirus.